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Protein kinases work because their flexibility allows to continuously switch from inactive 
to active form. Despite the large number of structures experimentally determined in 
such states, the mechanism of their conformational transitions as well as the transition 
pathways are not easily to capture. In this regard, computational methods can help to 
shed light on such an issue. However, due to the intrinsic sampling limitations, much 
efforts have been done to model in a realistic way the conformational changes occurring 
in protein kinases. In this review we will address the principal biological achievements 
and structural aspects in studying kinases conformational transitions and will focus on the 
main challenges related to computational approaches such as molecular modeling and MD 
simulations. 
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INTRODUCTION 

Protein kinases, one of the largest gene families in eukaryotes, are 
a class of enzymes which regulates several key cellular processes, 
such as cell growth and differentiation. 

They all achieve their function by favoring particular structural 
arrangements, able to catalyze protein phosphorylation. Despite 
the high degree of the conservation of the overall fold, each kinase 
is able to respond to specific signals tuning its activity. 

Therefore, these molecules represent a unique example of 
the nature's ability to introduce diversification through relatively 
small variations to a highly conserved motif. 

Such variations, which rarely modify the optimized structure 
on the protein free-energy minimum, can severely affect pro- 
tein flexibility which in turn might modify the transition patterns 
designed to be optimal for their function. 

Protein kinases probably represent one of the most remark- 
able examples of the importance of the protein flexibility: in fact, 
they regulate — continuously switching from active to inactive 
state — multiple biological processes by posttranslational phos- 
phorylation of serine, threonine and tyrosine residues. In this 
context, the knowledge of their structures, although mandatory, 
it is not sufficient for a complete understanding of their function, 
thus, the description of their conformational transition pathways 
is of paramount importance. 

However, the experimental techniques able to accurately 
describe the transient/intermediate states populated between the 
conformational changes are still under development (Ban et al., 
2011; Waldrop, 2014), thus, computational methods can greatly 
help in this case. 

Atomistic Molecular Dynamics (MD) simulations — based on 
the classical mechanics laws and using well-refined empirical 



force fields — are probably the most accurate computational tech- 
nique for the study of protein flexibility, being able to describe 
the dynamical evolution of molecular systems at atomistic level 
of detail. Unfortunately, the time-scale reached by standard MD 
simulations is still limited by the computational power to the 
microsecond, thus excluding processes occurring on longer times, 
such as the conformational transitions of almost all kinases. 
Therefore, several theoretical-computational techniques able to 
enhance the conformational sampling were applied to model such 
a process. 

Here, after presenting the structural features of protein kinases, 
we discuss some recent results obtained in modeling kinase con- 
formational transitions by enhancing sampling techniques and 
brute-force approaches, highlighting their advantages and lim- 
its. Finally, due to the involvement of this protein family in very 
important diseases such as cancer, we briefly introduce the results 
reached in drug design using computational techniques able to 
provide a complete description of the accessible conformational 
space. We discuss their still full unexploited potential to further 
enhance the kinase inhibitor performance. 

KINASE STRUCTURE AND CONFORMATIONAL VARIABILITY 

To get insights into the structure/dynamics/function relationship 
we need to describe their molecular structure in detail. Although 
the number of human protein kinase family members is large, 
the existing X-ray crystallographic studies showed that the three- 
dimensional (3D) structures of their catalytic domains are similar. 
Studies based on X-ray structures of PKA revealed that catalytic 
subunits of protein kinases share a conserved core consisting of 
two lobes. The N-terminal small lobe (N-lobe) and C-terminal 
large lobe (C-lobe). These two lobes host a deep pocket that 
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accommodates a molecule of ATP bound to one or two diva- 
lent cations: magnesium or manganese that compensate for the 
strong negative charge of the ATP phosphates (Kornev and Taylor, 
2010). This special structural arrangement is required because the 
phosphoryl transfer reactions are highly depend on the bond dis- 
tances and charge distribution in the reaction transition state. 
In particular, structural studies performed with transition state 
mimics showed that the presence of Mg2+ ions and positively 
charged groups in its vicinity would imply an associative mecha- 
nism for phosphoryl transfer, thus optimizing the overall reaction 
(Madhusudan et al, 2002). The N-lobe usually includes five 
f5-strands and an a-helix (hereafter called ctC-helix). Despite the 
fact that the ^-strands form a relatively rigid antiparallel P sheet, 
the N-lobe is very dynamic and flexible. The C-lobe is mainly 
a-helical (Figure 1) and includes the activation segment, a 20-35 
residues stretch located between a conserved DFG motif, and 
the APE motif (Huse and Kuriyan, 2002; Nolen et al, 2004), 
that is highly conformationally variable and its conformation 
can influence both substrate binding and catalytic efficiency. The 
C-lobe serves as a docking site for substrate peptides/proteins. 
The N-terminal part of the peptide lies in a groove between the 
aD and aF-helices on one side and the ctG-helix on the other side. 

Bioinformatics approaches (Kornev et al, 2006) have shown 
that key hydrophobic residues in kinases have a strict organiza- 
tion around cores. In particular, the overall fold is joined by two 
hydrophobic non-contiguous motifs referred as "spines" (Kornev 
et al, 2008). The spines connect all critically important elements 
of the protein kinase molecule to a single hydrophobic helix 
(aF) that provides their positioning in space. These motifs regu- 
late the protein kinase activity. The regulatory spine is typically 
assembled after phosphorylation loop and the catalytic one is 




FIGURE 1 | Ribbon representation of the molecular fold of the CDK2 in 
the closed (cyan) and open conformation (orange). The structures were 
extracted from molecular dynamics trajectories (Besker et al., 2013). 



completed by the ATP adenine ring. All these are anchored to the 
buried hydrophobic aF helix of the C lobe and provide the basic 
architecture of the core. 

The "activation loop" is a highly flexible loop, which adopts an 
extended conformation in the active kinase domain to function 
as a binding platform for the peptide substrate. The experimen- 
tal structures of kinases show that ctC-helix (in the small lobe) 
and the activation loop (in the large lobe) adopt distinct con- 
formations associated with the active or different inactive forms. 
This extended conformation is stabilized by phosphorylation, as 
observed in the active and phosphorylated kinase domain of the 
Src-family kinase Lck (Hantschel and Superti-Furga, 2004). In 
contrast to the active kinase, the inactive kinases are structurally 
diverse (Noble et al., 2004). This diversity arises because no cat- 
alytic requirements constrain the fold when it is inactive, with the 
presence of different inactive conformations that, nevertheless, 
share a number of common structural themes. 

In many kinases aC-helix can rotate and change its position, 
thereby altering the orientation of key catalytic residues. In the 
active state of EGFR kinase, the activation loop maintains a (39 
strand and an overall conformation compatible with substrate 
binding while the aC-helix is adjacent to the ATP-binding site 
(the "aC-in"conformation), and the catalytic important Asp831 
residue of the conserved "DFG" motif is found within that site 
("DFG-in"). That is, in the active state, the phenylalanine (Phe) 
side-chain occupies the ATP-binding pocket, and the aspartate 
(Asp) side-chain is located in the outside of the pocket. When 
the so-called "DFG-flip" occurs, the Asp and Phe residues swap: 
the Asp side-chain rotates into the ATP-binding pocket, and the 
Phe side-chain rotates out of the ATP-binding pocket (DFG-out 
conformation), leading the kinases to the inactive state (Shan 
et al., 2009). Some human kinases were shown to be able to 
adopt the DFG-out conformation (Debondt et al, 1993; Sicheri 
et al, 1997; Xu et al, 1997) and it was suggested that the DFG- 
in and DFG-out conformations are in dynamic equilibrium and 
can interconvert to determine the kinase function (Levinson et al., 
2006). 

The protein kinase fold may switch between an active or inac- 
tive conformation by extra domains or separate subunits. How 
the active and inactive states are stabilized and how states inter- 
convert are key questions in understanding kinase regulation. 
Typically, the activation segment contains a serine/threonine or 
tyrosine residue that can be phosphorylated. This site is referred 
to as first "phosphorylation site"; some kinases can have up to 
three phosphorylation site in the activation loop and perform 
function specific for these kinases. Other kinases, called non-RD 
kinases, lack the conserved arginine residue in kinase subdomain 
and it has been observed that some of them do not autophospho- 
rylate the activation loop and are either constitutively active or 
regulated through alternative mechanisms (Dardick and Ronald, 
2006). Most protein kinases are comprised of more than just 
the kinase domain. This can be flanked by other domains which 
tend to be quite dynamic or are part of a multi-subunit com- 
plex such as the Cyclin-dependent Kinases (CDK) which require 
an activating cyclin or phosphorylase kinase. Others such as the 
receptor tyrosine kinases are anchored to membranes and often 
have long segments that tether them to the membrane as well 
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as long C-terminal tails. Certain protein kinases can be acti- 
vated or inhibited by specific polypeptide cofactors. In CDK2 the 
activation depends on association with a cyclin subunit via the 
interaction of the aC-helix with specific hydrophobic residues on 
cyclin that enable the active site to be accessible for phospho- 
rylation (Lowe et al., 1997; Lolli, 2010). Many protein kinases 
dimerize as part of their activation mechanism, and dimerization 
can be regarded as a special case of kinase activation by acces- 
sory proteins or domains. In such cases, either both partners are 
activated by reciprocal phosphorylation or one partner (the acti- 
vator kinase) activates the other (the receiver kinase) through an 
allosteric mechanism (Endicott et al., 2012). 

COMPUTATIONAL STUDIES: AN OVERVIEW 

Computational works focusing on kinase conformational transi- 
tions can be divided in two groups: those based on brute-force 
approach, where the system was simulated for microseconds and 
those where the dynamical evolution of the system was altered 
to speed up the sampling. Using the former approach and a 
special-purpose supercomputer, the group of D.E. Shaw described 
spontaneous transitions in the microsecond time-scale for the 
kinase domain of the EFGR from the active to the so-called "Src- 
like inactive" conformation by way of two sets of intermediate 
conformations, significantly different from the existing crystal 
structures. Interestingly, in one of them the helical arrangement 
of the activation loop (A-loop) leaves the well-known phospho- 
rylation site exposed (Shan et al., 2013). Similarly, Yang et al. used 
several independent MD trajectories starting from the putative 
intermediate conformations to reconstruct the transitions path- 
ways between active and inactive states for the Src kinase and 
identified the concerted conformational rearrangements in terms 
of structural regions of the protein, such as the A-loop and the 
aC-helix (Yang et al, 2009). They also found that transient struc- 
tures in not fully active conformations allow exposure of Tyr416 
to the bulk solvent — the residue whose phosphorylation locks the 
Src in the active state — a feature typical of the fully active states. 
In an extensive application of the Replica Exchange MD — an 
enhancing sampling technique favoring the overcome of poten- 
tial energy barriers — Huang et al. found that the aC-helix acts as 
an energy switch between the active and inactive conformations 
and describe the sequence of events along the optimal inactiva- 
tion path as the fold of the A-loop into the active site followed by 
the aC-helix movement (Huang et al., 2012). 

Due to the large computational requirements, brute-force 
approaches are still limited to few research groups and inevitably 
to few systems. Therefore, a class of theoretical-computational 
methods able to sample the conformational rearrangements at 
limited computational costs were applied to protein kinases. 
Perturbative approaches, which are likely to be the most powerful 
techniques to model the free energy of a chemical reaction, are 
not particularly suited in this case, mainly because it is very diffi- 
cult to find a proper reaction coordinate in a high-dimensional 
space able to fully discriminate the two conformational states. 
Three of such methods, the Targeted, Steered and Biased MD 
were applied to model the conformational transitions between 
the active/inactive forms of the catalytic domain of LYN kinase, 
a member of the Src family of protein tyrosine kinases: the results 



showed that although the transition pathways described by these 
approaches are similar, the path is strongly dependent on the 
choice of progress variable (Huang et al., 2009). 

One of the most interesting approaches, however, is based on 
the possibility to iteratively select the conformations sampled by 
unbiased MD which are most likely to be productive, in order to 
limit the time spent in sampling the conformational space regions 
within the free-energy minimum basin. In this regard, Beckstein 
et al. applied the RMSD as a measure to apply their Dynamic 
IMportance Sampling (DIMS) algorithm to describe the zipping 
and unzipping of adenylate kinase (Beckstein et al., 2009). From 
their results, a cooperative salt bridge zipper is hypothesized to be 
the rate-limiting step of the apo-AdK conformational transition. 

Following a similar approach, our group applied the Essential 
Dynamics algorithm to the description of the conformational 
transitions in the CDK2 (Besker et al, 2013). The power of such a 
technique is due to the fact the direction in the high-dimensional 
space is described by the native protein movements in its sta- 
ble conformational states, e.g., open and closed. In this case, the 
choice of the progress variable(s) is natural and self-defined by 
the unbiased MD simulations of the starting and target state. In 
the case of CDK2, we found that both the opening and closure 
follow common transition paths in the essential subspace, involv- 
ing the same structural determinants (i.e., the ctC-helix and the 
activation loop) but in opposite order, indicating that the time- 
sequence of such rearrangements determines the specific protein 
conformational transition. It is worth noting that due to their 
high computational efficiency, the last two methods can be used 
to fill the gap between single-system and system-wide studies, 
thus providing an unvaluable source of information at the protein 
family level. 

All the above methods, with their advantages and shortcom- 
ings, represent an intriguing field of research and, in the case of 
the kinase conformational transitions, they have contributed to a 
better description of the transition pathways between the active 
and inactive forms. 

Nevertheless, several important aspects from a biochemical 
point of view are still not well-characterized in detail, including 
for example the possibility to bridge the gap between the atomic 
description of the transitions and the protein activity and/or the 
role played by possible biological partners. In the next few years, 
with the support of new or refined enhancing sampling algorithm 
and the fast growth of the computational power, we envision 
that a better modeling of large conformational rearrangements of 
biomolecules, as those occurring in kinases, will be within range 
of the scientific community. 

INHIBITORS 

The central biological role of protein kinases leads to their fre- 
quent alteration in several pathologies, first of all cancer. It is 
not surprising, therefore, that they are major targets for therapy. 
The great number of family members and their high degree of 
conservation, however, easily lead to multi target inhibition and 
therefore important side effects. First-generation CDK inhibitors, 
for instance, targeted the non-specific ATP-binding site and were 
discarded in the pre-clinical phase because of their toxic effects 
(Lapenna and Giordano, 2009). 
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The importance of kinase structural flexibility on the rational 
drug design of their inhibitor can be appreciate looking at the 
experience of imatinib, the first Bcr-Abl inhibitor approved by the 
US Food and Drug Administration that revolutionized treatment 
of Chronic Myelogenous Leukaemia (CML). At the time of this 
drug discovery, the structural knowledge on the kinase family was 
not deep enough to rationalize its mode of action, but several sub- 
sequent studies demonstrated that imatinib exploits the change of 
conformation of the DFG-motif in the unphosphorylated inactive 
form (Lambert et al., 2013). These second generation inhibitors, 
targeting a variable region in the inactive form, are more capa- 
ble of discerning among kinases but at the same time have the 
Achilles' heel of a greatest rate of drug-resistant mutations (Noble 
et al, 2004). 

First generation inhibitors, in fact, targeting the active con- 
formation, seldom developed drug- resistant mutations because 
they would have compromised the kinase activity. The war against 
cancer, therefore, moved another step, going against imatinib 
resistant cancers. 

A growing number of experimentally solved kinase structures 
(more than 5000 in PDB database), both in active and inac- 
tive forms, and several MD simulations at microsecond-timescale 
have changed our knowledge on this protein family. These deep 
roots made possible the successful application of in silico ratio- 
nal drug design in a growing number of cases: (1) the so-called 
hybrid-design was carried out, in which the same kinase inhibitor 
targets different protein regions, thus merging the characteristic 
of potency (typical of first generation inhibitors) and selectiv- 
ity (obtained by second generation ones) (Albaugh et al., 2012); 
(2) imatinib was also reengineered to reduce its cardiotoxicity, by 
making use of a different water propensity of two residues in the 
principal and secondary target kinases (Fernandez et al., 2007). 
Several MD simulations were carried out at this scope, that were 
able to discriminate the dynamic and structural characteristics of 
Bcr-Abl and C-Abl kinases, this last responsible for the cardiotoxic 
effects; (3) systematic rigid body docking of potential inhibitors 
against 84 unique protein kinases identified three derivatives of 
indirubin (Zahler et al., 2007), one of which has been recently 
indicated as particularly active against cancer metastasis (Braig 
etal, 2013). 

The research on better kinase inhibitors by computational 
methods has focused not only on the target but on the ligand 
as well, with the so-called Pharmacophore Modeling approach, 
i.e., "an ensemble of steric and electronic features that is nec- 
essary to ensure the optimal supramolecular interactions with a 
specific biological target and to trigger (or block) its biological 
response" ( Wermuth et al., 1998). Modern drug design combines 
more computational approach, as in the case of the identifica- 
tion of novel inhibitors against thymidine monophosphate kinase 
of M. tuberculosis with potent antitubercular activity, obtained 
through the combination of pharmacophore modeling, docking 
simulation, and structure interaction fingerprint analysis (Kumar 
etal, 2009). 

It is easy to predict that these coming years will still be rich 
of new kinase inhibitors with improved selectivity and mutant 
resistance. In fact, new targets, coming from the computational 
and experimental characterization of kinase intermediates 



are combined with improved computer-aided drug design 
methods. 

CONCLUSIONS AND PERSPECTIVES 

As briefly highlighted in this review, the intrinsic flexibility of 
the kinases represents a very intense field of research. The recent 
developments in theoretical-computational modeling are con- 
tributing to elucidate the actual dynamical behavior of the kinase 
domains, thus leading to the possibility of enhancing the design 
of new and selective inhibitors using a proper atomistic descrip- 
tion of their conformational basins and the pathways along which 
they continuously move to tune their activity. Finally, the growth 
of computational power as well as the development of new the- 
oretical treatments able to model slow and/or rare events at 
limited computational costs are making it possible to switch from 
single-system modeling of the kinase dynamical behavior to a 
system-wide modeling. 
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